Abstract

The replication crisis has eroded the public’s trust in science. Many famous studies, even those published in renowned journals, fail to produce the same results when replicated by other researchers. While this is the outcome of several problems in research, one aspect has received critical attention: reproducibility. The term reproducible research refers to studies that contain all materials necessary for other researchers to reproduce the scientific results. This allows others to identify flaws in calculations and improves scientific rigor. In this paper, we show a workflow for reproducible research using the R language and a set of additional packages and tools that simplify a reproducible research procedure.

1 Introduction

The scientific database Scopus lists over 73,000 entries for the search term “reproducible research” at the time of writing this document. The importance of making research reproducible was recognized in the early 1950s in multiple research subjects. With the Reproducibility Project, the Open Science Collaboration (Open Science Collaboration and others 2015) found that fewer than half of the studies it examined in psychological research could be replicated by other researchers. Several factors have contributed to this problem. From a high-level perspective, the pressure to publish and the increase in scientific output have led to a plethora of findings that will not replicate. Both bad research design and (possibly unintentional) bad research practices have increased the number of papers that hold little to no value.

2 Problematic Research Practices

One problem that is often mentioned is HARKing (Kerr 1998), or “hypothesizing after the results are known”. When multiple statistical tests are conducted at a conventional alpha-error rate (e.g., \(\alpha = .05\)), some tests are expected to reject the null hypothesis by chance alone; this is exactly what the error rate describes. If researchers then claim that these findings were their initial hypotheses, the results will be indiscernible from randomness. However, this is unknown to the reviewer or reader, who only hears about the new hypotheses. HARKing produces findings where there are none. It is thus crucial to determine the research hypothesis before collecting (or analyzing) the data.
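The inflation of chance findings can be made concrete with a short sketch. Assuming k independent tests in which every null hypothesis is true (the independence assumption and the choice of k = 20 are ours, for illustration only), the probability of at least one false rejection is 1 - (1 - α)^k, which is roughly 0.64 for α = .05 and k = 20.

```r
# Family-wise error rate for k independent tests at level alpha,
# assuming every null hypothesis is true (illustrative assumption).
alpha <- 0.05
k <- 20
fwer_analytic <- 1 - (1 - alpha)^k

# Monte Carlo check: under a true null, p-values are uniform on [0, 1],
# so each row below stands for one study running k tests.
set.seed(1)
n_sim <- 10000
p_values <- matrix(runif(n_sim * k), nrow = n_sim, ncol = k)
fwer_simulated <- mean(apply(p_values, 1, min) < alpha)

c(analytic = fwer_analytic, simulated = fwer_simulated)
```

A researcher who runs twenty such tests and reports only the "hits" as hypotheses would therefore have something to report in most studies, even when no effect exists.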

Another strategy applied (often without ill intent) is p-hacking (Head et al. 2015). This technique is widespread in scientific publications and is probably already shifting consensus in science. p-hacking refers to techniques that alter the data or the analysis until the desired p-value is reached. Omitting individual outliers, creating different grouping variables, and adding or removing control variables can all be considered p-hacking. This process also leads to results that will not hold under replication. It is crucial to show what modifications have been performed on data to evaluate the interpretability of p-values.
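The effect of such analytic flexibility can be illustrated with a small simulation. The sketch below assumes two groups with no true difference and compares the planned test with the smallest p-value across three analysis variants. The specific variants (dropping the most extreme observation, switching to a rank-based test) are hypothetical examples chosen for illustration, not a catalogue of practices from the cited work.

```r
# Simulate p-hacking under a true null effect (no group difference).
set.seed(42)
n_sim <- 2000
n <- 30  # observations per group

one_study <- function() {
  x <- rnorm(n)
  y <- rnorm(n)
  # Planned analysis: Welch t-test on all observations
  p_planned <- t.test(x, y)$p.value
  # Variant 1: drop the most extreme observation in each group
  p_trimmed <- t.test(x[-which.max(abs(x))], y[-which.max(abs(y))])$p.value
  # Variant 2: switch to a rank-based test
  p_rank <- wilcox.test(x, y)$p.value
  # A "hacked" study reports whichever variant gives the smallest p-value
  c(planned = p_planned, hacked = min(p_planned, p_trimmed, p_rank))
}

results <- replicate(n_sim, one_study())
rate_planned <- mean(results["planned", ] < 0.05)
rate_hacked <- mean(results["hacked", ] < 0.05)
c(planned = rate_planned, hacked = rate_hacked)
```

Because the reported p-value is the minimum over several analyses, the false positive rate of the "hacked" procedure can never be lower than that of the planned test, and in practice it exceeds the nominal level.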

The problem of researchers “massaging” the data to attain better p-values is compounded by the fact that many researchers do not understand the meaning of p-values. As Colquhoun (2017) found, many researchers misinterpret p-values and thus frame their findings as much stronger than they really are. Adequate reporting of p-values is thus important to the interpretability of results as well.
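One way to see why a "significant" p-value is weaker evidence than it may appear is a screening-style calculation of the false positive risk, that is, the share of significant results that come from true null hypotheses. The sketch below is a simplified illustration and not necessarily the exact computation used in the cited paper; the prior probability of a real effect (0.1) and the power (0.8) are assumed values.

```r
# Share of significant results that are false positives, given the
# significance level, the power, and the prior probability of a real effect.
false_positive_risk <- function(alpha, power, prior) {
  false_pos <- alpha * (1 - prior)  # true nulls that reach significance
  true_pos <- power * prior         # real effects that reach significance
  false_pos / (false_pos + true_pos)
}

# Assumed inputs: alpha = .05, power = .8, 10% of tested hypotheses are true.
fpr <- false_positive_risk(alpha = 0.05, power = 0.8, prior = 0.1)
fpr  # 0.36
```

Under these assumptions, 36% of significant findings would be false positives, far more than the 5% that a naive reading of the significance level suggests.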

Lastly, scientific journals have the problem that they are mostly interested in publishing significant results. Thus, contradictory “non-findings” seldom get published in renowned journals. There is little “value” for a researcher in publishing non-significant findings, as the additional work of writing a manuscript for something like arXiv often does not reap the same reward as a journal publication. This so-called publication bias (Simonsohn, Nelson, and Simmons 2014) worsens the crisis, as now only significant findings are available. It is thus necessary to simplify the process of publishing non-significant results.
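The consequence of publishing only significant findings can be sketched with a simulation. The example assumes many small two-group studies of the same modest true effect; the sample size (20 per group) and the effect size (0.2 standard deviations) are illustrative assumptions, not values taken from the cited work.

```r
# Simulate many studies of one small true effect and compare the average
# estimate across all studies with the average among significant ones.
set.seed(123)
n_sim <- 5000
n <- 20             # observations per group (assumed)
true_effect <- 0.2  # true mean difference in SD units (assumed)

sim <- replicate(n_sim, {
  control <- rnorm(n, mean = 0)
  treatment <- rnorm(n, mean = true_effect)
  c(estimate = mean(treatment) - mean(control),
    p = t.test(treatment, control)$p.value)
})

published <- sim["p", ] < 0.05  # only significant studies get "published"
mean_all <- mean(sim["estimate", ])
mean_published <- mean(sim["estimate", published])
c(all_studies = mean_all, significant_only = mean_published)
```

The average over all studies stays close to the true effect, whereas the average over the significant subset is clearly larger, so a literature built only from significant results overstates the effect.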

3 Reproducible Research Workflows

Many different solutions have been proposed to address these challenges (e.g., Marwick, Boettiger, and Mullen 2018; Wilson et al. 2017). However, no uniform process exists that allows creating documents and alternative reproducibility materials in one workflow.

In this paper, we demonstrate a research workflow based on the R language and the R Markdown format. This paper was written using this workflow, and the sources are freely available online (https://www.osf.io/kcbj5). Our workflow directly addresses the challenge of writing LNCS papers and a companion paper website (https://sumidu.github.io/reproducibleR/) that includes additional material and downloadable data.

In this paper, we will focus on the following aspects:

  • Creating a reproducible research compendium using R Markdown
  • Using GitHub and the OSF to make research accessible
  • Packages that simplify research in RStudio

4 Writing a Research Compendium

(Gentleman and Temple Lang 2007)

4.1 Project Workflow

4.2 Literate programming

  • Creating a research compendium (Gentleman and Temple Lang 2007)
  • Creating a project oriented workflow
  • Literate programming
  • Use of packrat for package management
  • Anonymization and Data replacement using sdcMicro (Templ, Meindl, and Kowarik 2020)
  • Creating LNCS Papers using rmdtemplates (Calero Valdez 2020)
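A project-oriented compendium along the lines listed above might be laid out as in the following sketch. The folder names are hypothetical (the text does not prescribe a layout), and the example uses only base R in a temporary directory so that it leaves no files behind. In a real project, a helper such as here::here() would typically replace the explicit root path.

```r
# Hypothetical compendium layout; folder names are illustrative assumptions.
root <- file.path(tempdir(), "compendium")
folders <- c("data-raw", "data", "analysis", "output")
for (folder in folders) {
  dir.create(file.path(root, folder), recursive = TRUE, showWarnings = FALSE)
}

# Store derived data inside the project, addressed relative to its root,
# so the analysis does not depend on one person's directory structure.
data_path <- file.path(root, "data", "iris.csv")
write.csv(iris, data_path, row.names = FALSE)

# Record the R version and loaded packages next to the results.
writeLines(capture.output(sessionInfo()),
           file.path(root, "output", "session-info.txt"))

list.files(root, recursive = TRUE)
```

Keeping raw data, derived data, analysis scripts, and outputs in separate folders under one root is what allows the whole directory to be shared or archived as a single unit.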

5 Open Data and Open Code

  • The use of version control and a public repository (Bryan 2018)
  • Creating a project README using R Markdown
  • Making use of GitHub Pages for companion websites
  • Use of the OSF for preregistration

6 Automating builds using drake

library(drake)
readd("hist")
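The call to readd() above presupposes that a target named hist has already been built and cached. The following minimal sketch shows how such a target could be defined and built. The helper create_plot(), the use of the iris data set, and the use of ggplot2 for the histogram are assumptions made for illustration; the text does not show how the target is actually produced.

```r
library(drake)
library(ggplot2)

# Hypothetical helper that draws a histogram of one column.
create_plot <- function(data) {
  ggplot(data) +
    geom_histogram(aes(x = Sepal.Length), bins = 20)
}

# The plan lists targets and the commands that produce them;
# drake derives the build order from the dependencies between them.
plan <- drake_plan(
  data = iris,
  hist = create_plot(data)
)

make(plan)               # builds outdated targets and caches the results
plot_obj <- readd(hist)  # retrieves the cached target
```

On a second call to make(plan), targets whose commands and dependencies are unchanged are skipped, which is what makes the build both automated and cheap to repeat.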

  • Several helpful packages for a reproducible workflow (here (Müller 2017), usethis (Wickham and Bryan 2019), drake (Landau 2020))
  • Several helpful packages for interactive, yet reproducible research in RStudio (citr (Aust 2019), gramr (Dumas, Marwick, and Shotwell 2020), questionr (Barnier, Briatte, and Larmarange 2018), esquisse (Meyer and Perrier 2020))
  • Creating powerful plots using ggstatsplot (Patil and Powell 2019)
  • Create research process plots using DiagrammeR (Iannone 2020)

7 Procedure

Process diagrams such as the one in Figure 7.1 can easily be created using the DiagrammeR (Iannone 2020) package.

library(DiagrammeR)

grViz(diagram = "
      digraph boxes_and_circles {
      
      graph [rankdir = LR]
      
      node [shape = box
            fontname = Helvetica
            ]
      'Setup OSF Project Site'
      Test
      
      node [shape = circle]
      
      Start
      
      edge []
      
      Start->'Setup OSF Project Site';
      'Setup OSF Project Site'->Test;
      }
      "
)

Figure 7.1: Example

7.1 Separation of Analysis and Data-Collection

7.2 Anonymization of Raw Data

  • Option 1: sdcMicro
  • Option 2: anonymizer

7.3 Preregistration

8 Discussion

9 Data

On this sub-page you can find the data used as a downloadable file (CSV, Excel, or PDF).

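library(DT)  # provides datatable()

# Placeholder data set (iris)
data_df <- iris

# Interactive table with column filters and copy/CSV/Excel/PDF export buttons
datatable(data_df, filter = list(position = 'top', clear = TRUE, plain = FALSE), extensions = c('Buttons','FixedColumns'), options = list(
    dom = 'Bfrtip',
    buttons = c('copy', 'csv', 'excel', 'pdf'),
    scrollX = TRUE,
    fixedColumns = TRUE
  ))

# `pkgs` is assumed to be a character vector of package names defined earlier in the document
#rmdtemplates::line_cite(pkgs) # This creates a single line citing all packages
rmdtemplates::list_cite(pkgs) # This creates a "tightlist" of all packages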

10 Used Packages

We used the following packages to create this document:

  • Package: knitr by Xie (2020)
  • Package: tidyverse by Wickham (2019)
  • Package: rmdformats by Barnier (2019)
  • Package: kableExtra by Zhu (2019)
  • Package: scales by Wickham and Seidel (2019)
  • Package: psych by Revelle (2020)
  • Package: rmdtemplates by Calero Valdez (2020)
  • Package: sdcMicro by Templ, Meindl, and Kowarik (2020)
  • Package: webshot by Chang (2019)
  • Package: here by Müller (2017)
  • Package: DiagrammeR by Iannone (2020)
  • Package: citr by Aust (2019)
  • Package: drake by Landau (2020)
  • Package: esquisse by Meyer and Perrier (2020)
  • Package: usethis by Wickham and Bryan (2019)
  • Package: gramr by Dumas, Marwick, and Shotwell (2020)
  • Package: questionr by Barnier, Briatte, and Larmarange (2018)
  • Package: ggstatsplot by Patil and Powell (2019)

References

Aust, Frederik. 2019. Citr: RStudio Add-in to Insert Markdown Citations. https://CRAN.R-project.org/package=citr.

Barnier, Julien. 2019. Rmdformats: HTML Output Formats and Templates for ’Rmarkdown’ Documents. https://CRAN.R-project.org/package=rmdformats.

Barnier, Julien, François Briatte, and Joseph Larmarange. 2018. Questionr: Functions to Make Surveys Processing Easier. https://CRAN.R-project.org/package=questionr.

Bryan, Jennifer. 2018. “Excuse Me, Do You Have a Moment to Talk About Version Control?” The American Statistician 72 (1): 20–27.

Calero Valdez, André. 2020. Rmdtemplates: Rmdtemplates - an Opinionated Collection of Rmarkdown Templates. https://github.com/statisticsforsocialscience/rmd_templates.

Chang, Winston. 2019. Webshot: Take Screenshots of Web Pages. https://CRAN.R-project.org/package=webshot.

Colquhoun, David. 2017. “The Reproducibility of Research and the Misinterpretation of P-Values.” Royal Society Open Science 4 (12): 171085.

Dumas, Jasmine, Ben Marwick, and Gordon Shotwell. 2020. Gramr: The Grammar of Grammar. https://github.com/ropenscilabs/gramr.

Gentleman, Robert, and Duncan Temple Lang. 2007. “Statistical Analyses and Reproducible Research.” Journal of Computational and Graphical Statistics 16 (1): 1–23.

Head, Megan L, Luke Holman, Rob Lanfear, Andrew T Kahn, and Michael D Jennions. 2015. “The Extent and Consequences of P-Hacking in Science.” PLoS Biology 13 (3): e1002106.

Iannone, Richard. 2020. DiagrammeR: Graph/Network Visualization. https://CRAN.R-project.org/package=DiagrammeR.

Kerr, Norbert L. 1998. “HARKing: Hypothesizing After the Results Are Known.” Personality and Social Psychology Review 2 (3): 196–217.

Landau, William Michael. 2020. Drake: A Pipeline Toolkit for Reproducible Computation at Scale. https://CRAN.R-project.org/package=drake.

Marwick, Ben, Carl Boettiger, and Lincoln Mullen. 2018. “Packaging Data Analytical Work Reproducibly Using R (and Friends).” The American Statistician 72 (1): 80–88.

Meyer, Fanny, and Victor Perrier. 2020. Esquisse: Explore and Visualize Your Data Interactively. https://CRAN.R-project.org/package=esquisse.

Müller, Kirill. 2017. Here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.

Open Science Collaboration, and others. 2015. “Estimating the Reproducibility of Psychological Science.” Science 349 (6251): aac4716.

Patil, Indrajeet, and Chuck Powell. 2019. Ggstatsplot: ’Ggplot2’ Based Plots with Statistical Details. https://CRAN.R-project.org/package=ggstatsplot.

Revelle, William. 2020. Psych: Procedures for Psychological, Psychometric, and Personality Research. https://CRAN.R-project.org/package=psych.

Simonsohn, Uri, Leif D Nelson, and Joseph P Simmons. 2014. “P-Curve and Effect Size: Correcting for Publication Bias Using Only Significant Results.” Perspectives on Psychological Science 9 (6): 666–81.

Templ, Matthias, Bernhard Meindl, and Alexander Kowarik. 2020. SdcMicro: Statistical Disclosure Control Methods for Anonymization of Data and Risk Estimation. https://CRAN.R-project.org/package=sdcMicro.

Wickham, Hadley. 2019. Tidyverse: Easily Install and Load the ’Tidyverse’. https://CRAN.R-project.org/package=tidyverse.

Wickham, Hadley, and Jennifer Bryan. 2019. Usethis: Automate Package and Project Setup. https://CRAN.R-project.org/package=usethis.

Wickham, Hadley, and Dana Seidel. 2019. Scales: Scale Functions for Visualization. https://CRAN.R-project.org/package=scales.

Wilson, Greg, Jennifer Bryan, Karen Cranston, Justin Kitzes, Lex Nederbragt, and Tracy K Teal. 2017. “Good Enough Practices in Scientific Computing.” PLoS Computational Biology 13 (6): e1005510.

Xie, Yihui. 2020. Knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.

Zhu, Hao. 2019. KableExtra: Construct Complex Table with ’Kable’ and Pipe Syntax. https://CRAN.R-project.org/package=kableExtra.